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Abstract 


Spatially accurate and reliable estimates from fast-growing plantations are a key factor for planning energy sup- 
ply. This study aimed to estimate the yield of biomass from short rotation willow plantations in northern Eur- 
ope. The data were based on harvesting records from 1790 commercial plantations in Sweden, grouped into 
three ad hoc categories: low, middle and high performance. The predictors included climatic variables, allowing 
the spatial extrapolation to nearby countries. The modeling and spatialization of the estimates used boosted 
regression trees, a method based on machine learning. The average RMSE for the final models selected was 0.33, 
0.39 and 1.91 (corresponding to R? = 0.77, 0.88 and 0.45), for the low, medium and high performance categories, 
respectively. The models were then applied to obtain 1x1 km yield estimates in the rest of Sweden, as well as 
for Norway, Denmark, Finland, Estonia, Latvia, Lithuania and the Baltic coast of Germany and Poland. The 
results demonstrated a large regional variation. For the first rotation under high performance conditions, the 
country averages were as follows: >7 odt ha! yr! in the Baltic coast of Germany, >6 odt ha”! yr! in Den- 
mark, >5 odt ha! yr~! in the Baltic coast of Poland and between 4-5 odt ha™! yr“! in the rest. The results of 
this approach indicate that they can provide faster and more accurate predictions than previous modeling 
approaches and can offer interesting possibilities in the field of yield modeling. 
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Introduction 


How much biomass can be produced from fast-growing 
plantations in a given area? Fast-growing plantations 
are an alternative feedstock of wood biomass for the 
emerging energy sector and related industries. Such 
plantations are based on fast-growing woody species 
(such as willow) generally established on agricultural 
land, being intensively managed. The expected life span 
of a willow plantation is considered to be about 
25 years, and the same plantation can be harvested sev- 
eral times, with rotations (cutting cycles) from 3 to 
6 years (e.g., Rahman et al., 2014). 

At present, Sweden provides a good basis for com- 
mercial experience: fast-growing willow plantations 
have been cultivated at commercial level since the 
1980s, particularly willow, making Sweden the leader in 
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Europe both in long-term experience and total area 
planted, which entails c. 13 000-16 000 ha (Mola- 
Yudego & Gonzalez-Olabarria, 2010). In addition, Den- 
mark currently entails 5700 ha (Jørgensen et al., 2014), 
England c. 2500 ha (DEFRA, 2014) and Germany c. 
4000-5000 ha (Wuhlisch, 2012), whereas there are ambi- 
tious goals for their expansion, for example: Poland, 
aiming at 170 000 ha to be planted with energy crops 
(Kunikowski et al., 2005), or UK, planning 350 000 ha 
by 2020 (DEFRA, 2007), although these levels of ambi- 
tion may be subject to changes, as the Swedish example 
shows (Mola-Yudego & Gonzalez-Olabarria, 2010). 
Planning the expansion of these plantations requires 
accurate and updated information concerning both cur- 
rent and potential yield, a prerequisite for a successful 
development of bioenergy markets based on energy 
crops (Mola-Yudego et al., 2014). At local level, produc- 
tivity estimates are required for planning wood fuel 
supply chains, for the location of bioenergy plants 
(e.g., Mola-Yudego & Pelkonen, 2011), for profitability 
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analysis (e.g., Toivonen & Tahvanainen, 1998; Rosen- 
qvist & Dawson, 2005; Dimitriou & Rosenqvist, 2011) 
and for the general logistics and management associated 
to biofuels, among others. At national level, producti- 
vity estimates are needed for the construction of scenar- 
ios (e.g., Bauen et al., 2010), for the implementation of 
policy incentives (e.g., Mola-Yudego & Pelkonen, 2008), 
or even for environmental assessment (e.g., Gonzalez- 
Garcia et al., 2012), among others. 

Methods to predict plantation productivity have been 
traditionally based on regression models, where yield 
was predicted as a function of different parameters, fol- 
lowing pre-established equations. To address the spatial 
component necessary to predict potential yield in differ- 
ent areas, these parameters have been often related to 
climate. Indeed, temperature and precipitation have 
been considered the most important factors for the 
growth of willow plantations (Perttu, 1999), and several 
of the initial studies on willow plantation growth have 
modeled yields based on climatic variables (e.g., Nilsson 
& Eckersten, 1983; Perttu et al., 1984). Also climate- 
based models were used in Sweden to model the poten- 
tial productivity at spatial level of willow plantations 
(Lindroth & Bath, 1999), resulting in productivity maps 
based on the linear relationship between yield and pre- 
cipitation during the growing season. 

However, these methods present several modeling 
limitations, as in many cases, the relationships between 
the variables used as predictors are complex, presenting 
several interactions, and the use of predefined relations 
(i.e., linear regression) can lose predictive power. More 
recent approaches have considered the inherent spatial 
component of yield prediction, using more ambitious 
modeling techniques. Aylott et al. (2008) produced esti- 
mates based on partial least squares, aiming at mapping 
the plantations productivity in UK. In this case, the 
modeling approach required detailed data of each plan- 
tation’s growth and management activities, which were 
obtained from a network of well-studied experimental 
trials. Mola-Yudego (2010) produced estimates based on 
k-nn imputation methods (see, e.g., Kilkki & Paivinen, 
1987; Muinonen & Tokola, 1990; Tomppo, 1990; Tokola 
et al., 1996), aiming at mapping the plantations’ produc- 
tivity in Sweden. Following this approach, the variables 
from a specific area are predicted as a weighted average 
of the spectrally closest plantations (which are defined 
as nearest neighbors, nn) and the feature spectrum is 
defined by a vector of climatic variables. In this case, no 
detailed data of each plantation’s growth were needed, 
but relied on a large pool of plantations to get stable 
and accurate estimates. However, the nature of the 
method makes it difficult to model the specific relations 
between variables and yields and limits its potential for 
extrapolation to those areas outside the sampled data. 


In this context, the aim of this study was to provide 
accurate estimates of productivity for fast-growing 
willow plantations, spatially extrapolating a large sample 
of Swedish plantations to nearby areas in northern Eur- 
ope, using climatic data. The results of this approach can 
be applied, among others, to the policy and economic 
considerations associated with wood supply derived 
from energy crops, as well as their future development. 


Materials and methods 


Description of the data 


The data set used in the calculations was based on harvesting 
records from the first rotation of 1790 willow plantations in 
central and south Sweden (see Mola-Yudego, 2010). The 
records therefore correspond to harvest aboveground leafless 
biomass. The field measurements were provided by Lantman- 
nen Agroenergi AB. Data with inconsistent records (e.g., miss- 
ing digits in the coordinates) or lacking information regarding 
the area planted or the location were excluded from the data 
set. All plots were georeferenced to a 1 km precision. The aver- 
age yield was calculated by dividing the total harvested bio- 
mass by area planted and the number of years of the rotation. 
The plantations were cut after the first growing season after 
planting to promote sprouting (cutback). The data used 
included 7753 ha planted during the period 1986-2005 in the 
area defined from 55°20'N to 61°29'N and from 11°33/E to 
18°56'E. The average size of the plantations was 4.3 ha (SD: 
4.2 ha). 

The climatic data were based on the climate layers calculated 
for northern Europe (Hijmans et al., 2004; WorldClim database 
version 1.4). These data consist of a set of grid maps resulting 
from an interpolation process of temperature and precipitation 
averages (Hijmans et al., 2005), based on the reference period 
1960-1990, to assess the average climatic conditions of the area. 

The maps used in this study had a 30-s spatial resolution 
(which provides ~1 km precision). To link the climatic vari- 
ables with the ground data, the maps were projected from the 
originally projected datum into the same coordinate system as 
the yield data (UTM, zone 33N). The precision of the interpo- 
lated climatic variables was 0.1° C for temperature and 1 mm 
for precipitation. The monthly averages of maximum, mean 
and minimum temperatures and precipitation (referred as 
Tmax, Tmean, Tmin and P, respectively) were obtained for each 
plantation. 


Statistical modeling approach 


The modeling approach was based on boosted regression trees 
(BRT). This approach combines statistical and machine learning 
techniques aiming at the improvement of the performance of a 
single model by fitting many models and combining them for 
prediction (Schapire, 2003). Besides the selection of the vari- 
ables to be included in the models, common to any modeling 
approach, BRT requires additional parameters to be calibrated. 
BRT models are defined by different parameters: number of 
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trees, learning rate (or shrinkage, related to the reduction of the 
impact of any additional tree), bag (random fraction of the 
residuals is selected to build the tree, per iteration) and number 
of interactions between variables. 

The models and statistical analysis were developed in R ver- 
sion 3.2.0 (R development core team, 2014). The BRT models 
were based on the dismo extension of the cpm package (Ridge- 
way, 2006), developed by Elith et al. (2008). 

The predicted variable was the mean annual growth per hec- 
tare, expressed as oven dry tonnes per hectare and year (odt 
ha! yr~!). Due to the high variability of the yields resulting 
from different management practices (Mola-Yudego & Aron- 
sson, 2008; Mola-Yudego, 2010), the plantations were classified 
according to their local performance, using municipalities as a 
unit of analysis. For each municipality, the plantations were 
ranked according to their measured yields and then grouped in 
three categories made of approximately an equal number of 
plantations (Fig. 1). There were 119 municipalities with data, 
with an average of 15 plantations each. 

The categories were described as follows: high performance, 
medium performance and low performance (Table 1), corre- 
sponding to the upper third of yield level, the middle third 
and the lower third at each municipality. The average estab- 
lishment year was 1994 for all three categories. Inside the same 
municipality, plantations of the category medium were estab- 
lished on average +0.09 years (P-value: 0.079) more recently 
than those of the low, and plantations of the category high were 
established +0.29 years (P-value: 0.622) more recently than 
those of the medium and +0.47 years, (P-value: 0.032) more 
recently than those of the low. There were not significant differ- 
ences concerning the climatic profile between categories in the 
same municipality. 


Distribution of yield levels 


0.8 


Density 


Yield 


Fig. 1 Distribution of the commercial willow plantations 
(n = 1790) in Sweden used in the models. The groups corre- 
spond to a classification of local performance at municipality 
level: low, middle and high performance. Density refers to the 
probability density of the counts. 


The climatic variables selected as predictors were chosen 
following Mola-Yudego (2011la). The main criteria were 
defined to reflect the influence of climatic characteristics on 
yield based on existing literature, to present minimal bias 
and minimal root-mean-square error (RMSE) and to avoid 
excessive multicorrelation. An additional criterion derived 
from the previous study was to remove variables whose 
empirical constant p was close to 0 when optimized for 
prediction (see Mola-Yudego, 2011a,b). A total of six climatic 
variables were included in the models: Tmax2 (February), 
Tmax7 (July), Tmax8 (August), Tmin10 (Oct), Tmean5 
(May) and Psum (sum of the precipitation from May to 
September). 

The combination of different number of interactions 
between the selected variables, the required number of trees, 
learning rates and bags would result in a large number of 
potential models. Therefore, several models were tested 
sequentially, being the learning rates fixed at: 0.02, 0.01, 
0.008, 0.005 and 0.001; the bag at: 0.2, 0.5 and 0.8; and the 
number of interactions at: 1, 2, 4, 6, 8, 10, 12, 14 and 16. 
The number of trees was obtained through optimization (see 
Ridgeway, 2006): first the BRT models were built with the 
default 10-fold cross-validation, and this procedure was used 
to estimate the optimal number of trees to be included. The 
final models were finally built on the full data set, using 
the number of trees identified as optimal. This resulted in 
432 models. Among those, the model selection was based on 
a double criterion: the model should have a high predictive 
power and should present stable predictions (small varia- 
tions in the parameters defining the model should not result 
in markedly different predictions). These two criteria were 
evaluated through the RMSE (root-mean-squared error of the 
predictions vs. the observed data) and the RMSD (root- 
mean-squared deviation), respectively. The RMSD was 
defined to assess the stability of the predictions: Each model 
was run in 6500 points generated on agricultural land in 
northern Europe. Therefore, the RMSD of those predictions 
was calculated for each model, being defined as follows: 


Table 1 Descriptors for the three categories defined. N: num- 
ber of plantations. N (municipality): average number of planta- 
tions per municipality. Yield refers to the harvested first 
rotation (cutting cycle). Year Est. refers to the year when the 
plantation was established. Relative difference refers to the differ- 
ences in year of establishment among categories inside the 
same municipality. Numbers in parenthesis refers to the stan- 
dard deviations 


Low Medium High 
performance performance performance 
N 578 620 592 
N 4.9 5.2 5.0 
(municipality) 
Yield 1.23 (0.57) 2.47 (0.95) 4.73 (2.45) 
(odt ha™! yr) 
Year Est. 1994.1 (2.395) 1994.07 (2.398) 1994.33 (2.583) 
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a. U. 2 
RMSD = a yi) 


(1) 


where, y; is a prediction for a point using a given model, y; is 
the mean of the predictions of all models for point I and n the 
total number of models considered for each performance cate- 
gory. After the evaluation, prediction maps were produced for 
northern Europe, in areas geographically and climatically simi- 
lar to Sweden. The predictions were restricted to those areas 
defined as agricultural land using the Corine 2000 classification 
(EEA, 2000), 250 m resolution. 

Finally, for each country, an estimate of the average annual 
productivity for the estimated life span of the plantations was 
calculated. The calculation was based on Mola-Yudego (2010) 
where yield estimates for the first cutting cycle are used as a 
reference value. The cutting cycle length was assumed to be 
4 years, and it considered five rotations. The increments of pro- 
ductivity along the rotations were estimated based on Mola- 
Yudego & Aronsson (2008). The average annual yield was then 
calculated by dividing the accumulated production by the total 
number of years of the five rotations, plus one, to include the 


initial year for cutback (in total: 21 years); these values were 
the basis for estimating of the potential energy produced under 
different percentages of available arable land. 


Results 


Of the 432 models constructed, 62 could not be calcu- 
lated, particularly those with high learning rate (0.02), 
especially when there were many interactions and 
in the high performance category. In total, 142, 135 
and 93 models were considered, for the low, middle 
and high performing categories (Fig. 2). The results 
showed that models with few interactions between the 
variables resulted in low RMSE (Fig. 3). The average 
RMSE for all the models presented was 0.40, 0.46 and 
1.98 (corresponding to R* = 0.64, 0.75 and 0.36) for the 
categories of low, medium and high performance, 
respectively. 

In general, the overall performance of the predictions 
was better for the medium and low productivity planta- 
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Fig. 2 Performance of the tested models (Root-mean-square error) and stability of the predictions (Root-mean-square deviation). 
Each point represents the overall results for one model. RMSD is calculated as the root mean squared of the differences in 6500 test 
points between predictions for a given model and the average predictions from all models. (a) Lowest performance (n = 142), (b) mid- 
dle performance (n = 135), (c) highest performance category (n = 93). Color gradient: light = 1 interaction, darkest = 16 interactions. 
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Fig. 3 Root-mean-square error (RMSE) as a function of the number of interactions. (a) Lowest performance, (b) middle performance, 


(c) highest performance category. 
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tions. The final models selected included the following: 
14 interactions, 0.005 learning rate and 0.5 bag for the 
low performance; 16 interactions, 0.02 learning rate and 
0.5 bag for the medium performance; and 16 interac- 
tions, 0.008 learning rate and 0.5 bag for the high perfor- 
mance. The coefficients of determination (R°) were 0.77, 
0.88 and 0.45, respectively (Table 2). The models failed 
to accurately predict the highest yields of the best per- 
formance group, especially above 10 odt ha™! yr™' 
(Fig. 4). 

Psum and Tmax in February presented the highest 
weights in all categories (over 40%), whereas Tmax in 
August presented the lowest (lower than 10%) (Table 2). 
Psum presented strong interactions with all the vari- 
ables related to temperature included in the models. 
The highest interactions for all categories were between 
Tmax in February and Psum, and Tmin in October and 
Psum. In the case of middle performance, an additional 


Table 2 Weights for each variable (%), estimates of model’s 
Root-mean-squared error and coefficient of determination (R?) 
of the predictions, for every performance group (Low: low per- 
formance, Medium: middle performance, High: upper perfor- 
mance) 


Variable Low Medium High 
Tmax2 24.65 23.87 21.52 
Tmax7 14.48 16.45 10.77 
Tmax8 8.11 6.61 8.25 
Tmin10 14.91 15.05 21.06 
Tmean5 14.87 14.68 18.24 
Psum 22.95 23.32 20.12 
RMSE 0.33 0.33 1.87 


R? 0.77 0.88 0.45 


Psum: aggregated precipitation May to September; T, tempera- 
ture; Max, maximum; Min, minimum, numbers correspond to 
the calendar months. 


strong interaction was between Tmax in July and Psum 
(Table 3). 

The effect of the variables on yield was different for 
the different variables and performance categories. In 
general, precipitation had a positive effect on yield, for 
all performance groups. Tmax in February, Tmean in 
May and Tmin in October had a positive effect, whereas 
Tmax in July presented a negative effect. Tmax in 
August presented a positive effect for the middle per- 
formance group (Fig. 5). It must be taken into account 


Table 3 Variable interactions, for every performance 


Variable Tmax2 Tmax7 Tmax8 Tminl0 Tmean5 Psum 
Low 
Tmax2 0 1.62 0.93 7.3 3.24 6.94 
Tmax7 0 0 0.74 0.95 0.96 1.86 
Tmax8 0 0 0 0.76 2.01 1.25 
Tmin10 0 0 0 0 0.63 5.94 
Tmean5 0 0 0 0 0 3.79 
Psum 0 0 0 0 0 0 
Medium 
Tmax2 0 3.02 1.83 8.74 28.62 30.64 
Tmax7 0 0 1.4 2.39 1.83 21.02 
Tmax8 0 0 0 5.2 0.82 5.36 
Tmin10 0 0 0 0 2.86 6.51 
Tmean5 0 0 0 0 0 15:72 
Psum 0 0 0 0 0 0 
High 
Tmax2 0 5.89 4.78 18.45 17.49 18.28 
Tmax7 0 0 4.52 4.75 1.18 5.09 
Tmax8 0 0 0 1:7 3.63 2.05 
Tmin10 0 0 0 0 3.52 27.19 
Tmean5 0 0 0 0 0 9.03 
Psum 0 0 0 0 0 0 


Psum: aggregated precipitation May to September; T, tempera- 
ture; Max, maximum; Min, minimum, numbers correspond to 
the calendar months. 
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Fig. 4 Measured and predicted yield for the commercial fast growing willow plantations, according to the models. (a) Lowest perfor- 


mance, (b) middle performance, (c) highest performance group. 
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that the partial dependence is also defined by the inter- 
actions among variables (Table 3). 

The resulting predictions were aggregated by region 
and country, to estimate the potential area under differ- 
ent productivity categories (Fig. 6). It must be taken into 
account that in the northernmost regions, there is scarce 
agricultural land, often located nearby the coastal areas 
in the best climatic conditions. This makes the regional 
averages of Lapland (Finland), Finnmark (Norway) and 
Upper and Middle Norrland (Sweden) to be high when 
aggregated at region level, compared to other more 
southern regions. 

The analysis of the stability of the estimates was eval- 
uated through the standard deviation of the predictions 
of the models for each of 6500 random points (Fig. 7). 
The models show consistency concerning the predic- 
tions in Sweden, Finland, Estonia and most of Latvia. 
However, small changes in the model parameters 
resulted in larger differences in the predictions for the 
Western areas of Jutland (Denmark) as well as the west 
of Lithuania, and the Eastern parts of Northern Poland, 
especially concerning the best performance group. 


Concerning the best performance category, the mean 
for the first rotation by country (Fig. 8) ranged from 4.1 
odt ha! yr! (Finland) to 7.1 odt ha! yr! (Northern 
Germany). For the whole area, the average for the first 
rotation for this category was 4.8 odt ha! yr (SD 
0.98). However, it must be taken into account that there 
is a strong spatial variability, as most of the countries 
showed large regional differences (Fig. 9). When consid- 
ering the 10% agricultural land with the highest willow 
productivity (1.5 x 10° ha), the average yield for the 
area studied was 6.5 odt ha’! yr! for the first rotation. 


Discussion 


This study aims at estimating the spatial distribution of 
production of biomass for energy from short rotation 
willow plantations by modeling their potential produc- 
tivity based on climatic variables. The data were based 
on harvest records for the first rotation from 1790 com- 
mercial plantations for the period 1989-2005, and it has 
been extensively used in the past for modeling purposes 
(e.g., Mola-Yudego & Aronsson, 2008; Mola-Yudego, 
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Fig. 5 Partial dependence plots of the predictors in the model, for the performance categories (lower, middle, upper). The tempera- 
tures are expressed in degrees Celsius (x10), and precipitation in mm of rainfall. Yield: average annual harvested yield of willow 
plantations. Psum: aggregated precipitation May to September, T: Temperature, Max: maximum, Min: minimum, numbers corre- 


spond to the calendar months. 
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(a) 


(b) 


(c) 


Fig. 6 Yield estimates for willow plantations in the agricultural areas of northern Europe. Darker colours indicate high productivity 
areas. (a) Lowest performance, (b) middle performance, (c) highest performance group. Left, estimates at 1 x 1 km resolution (see 
Supporting Information). Right, averages by regions. The average includes predictions only on agricultural land. Therefore, predic- 
tions in the northern latitudes are based on the limited amount of land available. 


2011a,b). This data pool presents a realistic basis for 
yield estimates, as the predictions relate to the final 
amount of biomass that will be effectively utilized; for 
example, Sevel et al. (2012) estimated differences 
between nondestructive and harvested observations to 
be around 1.2 odt ha“! yr~', and Searle & Malins (2014) 
and Mola-Yudego et al. (2015) observed that records 
from, for example, experimental trials tend to overesti- 


mate yields especially when the plots are small. The 
large amount of data available allowed the inclusion of 
almost 60% of the whole area planted with willow for 
bioenergy in Sweden, which enhances the reliability of 
the estimates. 

Nevertheless, a disadvantage of the data used was 
the lack of detailed information concerning the manage- 
ment practices performed by the farmers, as well as 
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specific soil records from the plantations. Although 
some authors have considered that at large, tempera- 
ture and precipitation are the most important factors 
for the growth of willow plantations (e.g., Perttu, 
1983, 1999), several studies have been demonstrating 
the important role than soil type, in addition to cli- 
mate, plays in yield performance (Aylott et al., 2008). 
In this line, Sevel et al. (2012) and Larsen et al. (2014) 
pointed out that soil, clone and its interaction are key 
variables to model the productivity of willow planta- 
tions. In this study, the explicit inclusion of soil vari- 
ables presented several inconveniences as there is 
limited information available concerning the soil tex- 
tures with the necessary spatial resolution and conti- 
nuity to be included in the models for the studied 
countries. Similarly, the varieties used in the planta- 
tions play an important role in their yield perfor- 
mance (e.g., Lindegaard et al., 2001) but detailed 
information concerning the clones planted in most of 
the plantations was also not available. 

The aggregation of the plantations in three perfor- 
mance categories, following Mola-Yudego (2010), was 
aimed at homogenizing the conditions inside the same 
group, thus incorporating to a certain extend manage- 
ment or local soil conditions. The municipality bound- 
aries were used as a grouping factor, to allow a 
sufficient number of plantations to make the classifica- 
tion while at the same time assuring that the same cli- 
matic conditions would be shared between categories. 
In this sense, the high performing plantations inside a 
municipality would most likely be the result of better 
management, better clones and better soil conditions 
than the lower performing ones. It must be stressed that 
farmers decide the location of the plantations and the 
soil quality where they are established (Mola-Yudego & 
Aronsson, 2008). The working assumption proposed 
that the variability between categories may be attributed 
to variables related to the intensity of management 
(among others fertilization and rotation length), the 
local soil quality, the clone used and the interactions of 
these factors, whereas the climatic variables would 
rather explain the spatial variability of yields in planta- 
tions of the same category. In fact, the performance of 
the models was lower in the high performance category, 
underlining the fact that a large part of the variability 
between plantations in this category was explained by 
variables other than climate (R° = 0.45 for the final 
model), whereas for medium performance category, a 


large part of the variability was effectively explained by 
the variables included (R? = 0.88). It must be taken into 
account that it is an ad hoc characterization based on 
final performance, and other factors may have affected 
yield (e.g., early frosts, heavy rains or moose browsing, 
among others). 

Moreover, yield levels are not fixed and can be 
increased along time through new varieties and man- 
agement techniques (Mola-Yudego, 2011b), thus to 
make a proper use of the predictions provided, they 
should be taken as a reference threshold defined by cli- 
mate for the period studied, rather than final predic- 
tions. In this sense, the models presented can be 
recalibrated once new data are available or can be trea- 
ted as a value based on climate that could incorporate 
both soil or clone factors by adding scenarios, correction 
factors in specific areas or an improvement factor based 
on trends (e.g., Mola-Yudego, 2010). 

Several initial studies have modeled the yields in 
Sweden based on climatic variables (e.g., Nilsson & Eck- 
ersten, 1983). Lindroth & Bath (1999) proposed a 
semimechanistic model to calculate plantations’ yield as 
a linear function of precipitation during the growing 
season. The application of the model provided maps of 
potential productivity for central and southern Sweden, 
but one disadvantage was that the models resulted in 
higher yield expectations than shown by empirical mea- 
surements based on the same commercial plantations 
(Mola-Yudego & Aronsson, 2008; Mola-Yudego, 2010) 
and that the method did not offer flexibility as the rela- 
tionship between precipitation and yield was set to be 
linear. 

The modeling approach taken in this study presents 
several advantages for modeling climatic data that may 
overcome these limitations. The use of BRT allows shap- 
ing the relationship between the variables and yield 
with almost no pre-assumptions concerning the shape 
of the relationship between the variables. This is a great 
advantage, as some of the variables and their interac- 
tions may have thresholds or limiting maxima. At the 
same time, BRT allows the inclusion of a large number 
of interactions between the variables and aims at find- 
ing those more statistically relevant, simplifying the 
modeling assumptions taken a priori. The approach has 
recently been used in productivity studies, to, for exam- 
ple, map the site index for different forest species 
(Aertsen et al., 2010) or the biomass yield of seminatural 
systems (Van Meerbeek et al., 2014). 


Fig. 7 Model sensitivity to changes in the calibration, spatially (left) and by frequency (right). Each point represents the standard 
deviation (SD) of the predictions resulting from the pool of models. Categories: (a) Lowest performance (n = 142 models), (b) middle 
performance (n = 135), (c) highest performance category (n = 93). There are 6500 predictions randomly distributed in agricultural 


land. 
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Fig. 8 Estimates for energy production assuming several percentages of available arable land (% agr land) for fast growing willow 
plantations, with regional average yields under high performance conditions. DE: northern regions (Schleswig-Holstein and Mecklen- 
burg-Vorpommern), PL: northern regions (Warminsko Mazurskie, Województwo pomorskie and Zachodniopomorskie). DK: referred 
to 2012, the rest referred to 2013. NO: cultivated land, Statistics Norway (2013). The yield is estimated harvestable biomass during the 
whole lifespan of a plantation, including one initial year for cut-back and assuming high performance plantations established along 


the country. 


In addition to its predictive power, the method is 
suitable for interpretation. The results showed that there 
was a strong effect of precipitation and temperature 
during early summer, which was expected and is in line 
with previous work (Lindroth & Bath, 1999). The mini- 
mum temperature in October had a strong and positive 
effect as it was found in Finland in Tahvanainen & 
Rytkonen (1999). The negative effects of the maximum 
temperatures in July may be related to higher evapo- 
transpiration and shortage of rainfall, given the high 
water demands of willow (Linderson et al., 2007). 

The models improved existing literature from com- 
mercial plantations, with higher prediction power than 
other studies with the same data (e.g., Mola-Yudego, 
2010, 2011a,b), while at the same time being able to deli- 
ver high resolution estimates at 1 x 1 km, although it 
was observed than the BRT models had problems pre- 
dicting values in the highest ranges, underestimating 
the most productive plantations. In general, the aver- 
ages agreed with previous studies based on commercial 


data: In Denmark, the averages found for the best per- 
formance category (6.5 odt ha’ yr) agree with the 
estimates of Sevel et al., 2012 (5.2-8.8 odt ha”! yr). In 
Sweden, the average and spatial estimates were match- 
ing the same spatial pattern Mola-Yudego (2011a,b) 
which used the same data set but with a KNN approach. 
In this case, the estimates are lower than in Mola- 
Yudego (2010) as the method could not incorporate the 
trends due to yield improvements along time. In Fin- 
land, the average was similar to Tahvanainen & 
Rytkonen (1999), although their study referred to stand- 
ing biomass by nondestructive methods. 

The predictions were based on the first harvest, and 
once plants have developed a root system the following 
harvests are substantially higher, which is confirmed in 
the literature (Hoffmann-Schielle, 1995, Labrecque & 
Teodorescu, 2003; Nordh, 2005). In this study, the aver- 
age yield for the whole life span of the plantations was 
estimated using the predicted yield of the first rotation 
as reference value and then extrapolated using the 
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Fig. 9 Areas and potential productivity of short rotation willow plantations on agricultural land in the Nordic area. The areas are 
presented aggregated under different productivity values, for the low, medium and high performance groups, according to the esti- 


mations. 


results of Mola-Yudego & Aronsson (2008) and Mola- 
Yudego (2011b). However, future research must be 
addressed to study the climatic effects during an 
extended period, including the specific effects that the 
climatic variables may have during the second and sub- 
sequent rotations. Also, it must be noticed that the final 
calculation of the whole life span included the first year 
(cutback) which slightly reduced the average yield. 
Concerning the spatial accuracy of the predictions, 
the method applied can present the risk of overfitting 
when the models become too complex. One of the solu- 
tions was the inclusion of a source of stochasticity in the 
cross-validation used to assess and fit the parameters 
(Schonlau, 2005) and the use of random points helped 
to assess the stability of the predictions. The main goal 
was that, if the parameters used in the models are 
slightly changed, the resulting predictions should be 
consistent. The analysis of this spatial consistency 
showed that the west areas of Jutland in Denmark, as 
well as the west part of Lithuania and Latvia, and some 
areas in Poland are particularly sensitive to the calibra- 
tion parameters and indicate that the yield extrapolation 
based solely on climatic variables and data from Swe- 


den may need additional data for a proper calibration, 
perhaps related to soil conditions in the area. 

The predictions were restricted to agricultural land, 
and the regional averages were therefore based on pre- 
dictions for these areas (i.e., excluding forest lands, non- 
productive land, lakes and rivers). This resulted in high 
regional values for northernmost areas (e.g., Lapland, 
Norrlands or Finnmark). As agriculture land is scarce 
on northern latitudes and located mainly in favorable 
conditions nearby the coast, the resulting averages for 
those counties are high as are based in a small sample 
distributed unevenly. Although those northern areas, in 
general, present the most adverse climatic conditions to 
willow establishment, attempts to develop highly pro- 
ductive clones with frost tolerance for those areas have 
been done since the late 1980s (Lumme & Tormala, 
1988), and relatively high yields cannot be excluded as 
a future scenario. 

Finally, the country estimates considered the energy 
produced under different percentages of agricultural 
land up to 30%, for reference. It is difficult to estimate 
for the whole region the area that will eventually be 
dedicated to fast-growing plantations for energy. As a 
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reference, however, Aust ef al. (2014) suggested that 
considering all ecological, ethical, political and technical 
restrictions, as well as future climate predictions, 5.7% 
of cropland would be a suitable percentage for Ger- 
many. 

The area studied shares many climatic and geographi- 
cal features with the areas in Sweden already cultivated 
with fast-growing plantations. In fact, for most of Den- 
mark and the west and southernmost cultivation zones 
in Finland, the Swedish experience concerning willow 
varieties and methods can probably be directly trans- 
ferred. In the case of Norway, there is good potential in 
many parts of the country. Despite the country’s overall 
limitations in agricultural land, areas located in the Øst- 
fold and Akerhus present good conditions for the estab- 
lishment of plantation schemes: the results show high 
potential yields, share geographical proximity and cli- 
matic similarity with Sweden allowing the interchange 
of varieties, there is agricultural land available (c. 16% 
of the land, Statistics Norway, 2013), and there is a 
potential demand of wood chips for bioenergy due to 
the high density of population. In the Baltic countries 
and Baltic coast of Poland and Germany, the results 
show good potential yields that can translate into signif- 
icant figures of energy production. Additionally, the 
enlargement of cultivated areas in those zones can 
result in rapid yield improvements once a critical mar- 
ket size is reached (Mola-Yudego et al., 2014). 

Indicatively, in the case of Sweden, estimates show 
that 1% and 5% of agricultural land could translate into 
1.5% and 7.2% of the total annual heat consumption, 
respectively (annual heat estimated in 47 TWh, Swedish 
Energy Agency, 2013). However, there are several com- 
ponents of potential supply of biomass crops beyond 
yield estimation that can play an important role in devel- 
oping the potential supply (e.g., the current agricultural 
systems, the role of relative prices, input costs and policy 
developments) and must be taken into account. 
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